Mixed Precision Dense Linear System Solvers for High Performance Reconfigurable Computing
Abstract
The iterative refinement method for linear system solvers can improve performance while maintaining numeric accuracy. Previous work on iterative refinement exploits single and double precision on CPUs, GPUs, or Cell/BE processors. Because those platforms support only two precisions, iterative refinement is limited on them. Reconfigurable Computing (RC) is a strong candidate for exploiting iterative refinement, since it can employ computation at any precision as long as the hardware resources are sufficient. In iterative refinement for RC, the choice of working precision for Gaussian elimination is extremely important, since its computational complexity is O(n^3) while the other steps are O(n^2). In this paper, we explore RC architecture and working precision for Gaussian elimination to obtain both high performance and satisfactory numerical solutions.
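As a rough illustration of the idea in the abstract, the following sketch emulates mixed precision iterative refinement in plain Python. It is not the paper's RC architecture: the function names (`round_to_bits`, `lu_factor_low`, `lu_solve_low`, `solve_refined`) and the default of 12 significand bits are assumptions made for this example. The custom working precision is emulated in software by rounding every intermediate result to a chosen number of significand bits. The O(n^3) factorization runs once at the reduced precision, while the O(n^2) residual and update steps run in double precision.

```python
import math


def round_to_bits(x, bits):
    """Round x to `bits` significand bits (emulated custom precision).

    Assumption for this sketch: only the significand is shortened;
    the exponent range stays that of a Python float.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)  # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)


def lu_factor_low(A, bits):
    """Gaussian elimination with partial pivoting, O(n^3), in low precision."""
    n = len(A)
    LU = [[round_to_bits(v, bits) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest pivot in column k for numerical stability.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular to working precision")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = round_to_bits(LU[i][k] / LU[k][k], bits)
            for j in range(k + 1, n):
                prod = round_to_bits(LU[i][k] * LU[k][j], bits)
                LU[i][j] = round_to_bits(LU[i][j] - prod, bits)
    return LU, perm


def lu_solve_low(LU, perm, b, bits):
    """Forward and back substitution, O(n^2), in low precision."""
    n = len(LU)
    y = [round_to_bits(b[p], bits) for p in perm]
    for i in range(n):  # solve L y = P b (unit lower triangular L)
        for j in range(i):
            prod = round_to_bits(LU[i][j] * y[j], bits)
            y[i] = round_to_bits(y[i] - prod, bits)
    for i in reversed(range(n)):  # solve U x = y
        for j in range(i + 1, n):
            prod = round_to_bits(LU[i][j] * y[j], bits)
            y[i] = round_to_bits(y[i] - prod, bits)
        y[i] = round_to_bits(y[i] / LU[i][i], bits)
    return y


def solve_refined(A, b, bits=12, tol=1e-12, max_iter=50):
    """Low-precision LU plus double-precision iterative refinement.

    Returns (x, number of refinement steps performed).
    """
    n = len(A)
    LU, perm = lu_factor_low(A, bits)    # expensive step, done once
    x = lu_solve_low(LU, perm, b, bits)  # initial low-precision solution
    b_norm = max(abs(v) for v in b)
    for it in range(max_iter):
        # Residual r = b - A x, computed in double precision.
        res = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in res) <= tol * b_norm:
            return x, it
        d = lu_solve_low(LU, perm, res, bits)  # correction reuses the factors
        x = [x[i] + d[i] for i in range(n)]    # update in double precision
    return x, max_iter


if __name__ == "__main__":
    A = [[10.0, 2.0, 1.0, 0.0],
         [1.0, 8.0, 0.0, 2.0],
         [2.0, 0.0, 9.0, 1.0],
         [0.0, 1.0, 2.0, 7.0]]
    b = [9.0, -23.0, 25.0, -24.0]  # equals A times [1, -2, 3, -4]
    x, steps = solve_refined(A, b, bits=12)
    print("solution:", x, "refinement steps:", steps)
```

Varying `bits` in this sketch gives a feel for the trade-off the abstract describes: fewer bits make the factorization cheaper in hardware terms but typically require more refinement steps, and for ill-conditioned matrices the refinement may not converge at all.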
Similar resources
Mixed Precision Comparison in Reconfigurable Systems
Customisable data formats provide an opportunity for exploring trade-offs in accuracy and performance of reconfigurable systems. This paper introduces a novel methodology for mixed-precision comparison, which improves comparison performance by using reduced-precision datapaths while maintaining accuracy by using high-precision datapaths. Our methodology adopts reduced-precision data-paths for p...
Efficient Parallel Solvers for Large Dense Systems of Linear Interval Equations
Verified solvers for dense linear (interval-)systems require a lot of resources, both in terms of computing power and memory usage. Computing a verified solution of large dense linear systems (dimension n > 10000) on a single machine quickly approaches the limits of today’s hardware. Therefore, an efficient parallel verified solver for distributed memory systems is needed. In this work we prese...
High Performance Computing Benchmark Tool for Parallel Processing of Large Models
Benchmarks for parallel processing of large models are an urgent need for High Performance Computing (HPC) as today’s model size reaches millions of degrees of freedom. Explicit solvers as in the case of crash dynamics or fluid dynamics do not require matrix based equation solvers and inherently exhibit good scalability on large numbers of processors. Whereas analysis requiring implicit solvers...
Breakthroughs in Sparse Solvers for GPUs
The CUDA Center of Excellence (CCOE) at UTK targets the development of innovative algorithms and technologies to tackle challenges in Heterogeneous High Performance Computing. Over the last year, the CCOE at UTK developed CUDA-based breakthrough technologies in sparse solvers for GPUs. Here, we describe the main ones – a sparse iterative solvers package, a communication-avoiding (CA) sparse ite...
Energy Footprint of Advanced Dense Numerical Linear Algebra using Tile Algorithms on Multicore Architecture
We propose to study the impact on the energy footprint of two advanced algorithmic strategies in the context of high performance dense linear algebra libraries: (1) mixed precision algorithms with iterative refinement allow to run at the peak performance of single precision floating-point arithmetic while achieving double precision accuracy and (2) tree reduction technique exposes more parallel...